numerical differentiation
Black-Box Optimization with Local Generative Surrogates: Supplementary Material
The original version of FFJORD does not support conditional input. To address this issue, we rewrote one of the base layers used in the FFJORD library. ... An example of such statistics is presented in Fig. 6. Figure 6: An example of monitored statistics during surrogate training for one iteration of optimization. The gradient bias is calculated per component of the gradient vector, i.e., if ... C.1 Procedure for Mixing Matrix Generation: A 10-dimensional mixing matrix A could be generated with the following Python code: ... C.3 Numerical Derivatives: To obtain numerical derivatives of R we use a central difference scheme: ... Muons are bent by the magnetic field and simultaneously experience stochastic scattering as they pass through the magnet, which causes random variations in their trajectories. Color represents the number of hits in a bin.
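The C.3 fragment above names a central difference scheme, but the formula itself is not reproduced in the snippet. A minimal sketch of the standard form, (f(x + h) - f(x - h)) / (2h), is given below. The function name `central_difference`, the default step size `h`, and the test function are illustrative assumptions and are not taken from the paper.

```python
import math


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with the standard central difference scheme.

    The step size h is an illustrative default, not a value from the paper.
    """
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Illustrative check on a function with a known derivative: d/dx sin(x) = cos(x).
approx = central_difference(math.sin, 1.0)
exact = math.cos(1.0)
print(abs(approx - exact))  # small; the truncation error scales with h**2
```

The truncation error of this scheme shrinks with h squared, while floating-point round-off grows as h shrinks, so h is usually chosen as a compromise between the two.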
Learning Neural Antiderivatives
Rubab, Fizza, Nsampi, Ntumba Elie, Balint, Martin, Mujkanovic, Felix, Seidel, Hans-Peter, Ritschel, Tobias, Leimkühler, Thomas
In particular, we contrasted direct supervision based on antiderivative estimates with several method classes that rely on repeated differentiation, including both automatic and numerical approaches. As part of this framework, we explored integral reduction techniques as a means to mitigate the computational overhead associated with repeated integration or differentiation. Our systematic experimental analysis of all presented methods across a diverse set of modalities reveals that differential supervision via naïve automatic differentiation generally outperforms all competing approaches in terms of result quality. However, this performance comes at the cost of substantial training time, particularly for higher-order integration. A practical alternative in many scenarios is numerical differentiation via finite differences, which often achieves a reasonable trade-off between quality and computational cost. Nevertheless, it is important to incorporate a compensation operator to correct for the signal smoothing inherent in this differentiation scheme. Our findings suggest directions for future research, such as investigating progressive supervision paradigms that leverage different supervisory signals throughout the learning process. Interestingly, our analysis reveals that antiderivative quality correlates only weakly with downstream task performance. Cumulative schemes often excel at efficiently computing highly non-local aggregates, where local inaccuracies tend to cancel out.
Physics Informed Distillation for Diffusion Models
Tee, Joshua Tian Jin, Zhang, Kang, Yoon, Hee Suk, Gowda, Dhananjaya Nagaraja, Kim, Chanwoo, Yoo, Chang D.
Diffusion models have recently emerged as a potent tool in generative modeling. However, their inherent iterative nature often results in sluggish image generation due to the requirement for multiple model evaluations. Recent progress has unveiled the intrinsic link between diffusion models and Probability Flow Ordinary Differential Equations (ODEs), thus enabling us to conceptualize diffusion models as ODE systems. Simultaneously, Physics Informed Neural Networks (PINNs) have substantiated their effectiveness in solving intricate differential equations through implicit modeling of their solutions. Building upon these foundational insights, we introduce Physics Informed Distillation (PID), which employs a student model to represent the solution of the ODE system corresponding to the teacher diffusion model, akin to the principles employed in PINNs. Through experiments on CIFAR-10 and ImageNet 64x64, we observe that PID achieves performance comparable to recent distillation methods. Notably, it demonstrates predictable trends concerning method-specific hyperparameters and eliminates the need for synthetic dataset generation during the distillation process, both of which make it an easy-to-use distillation approach for diffusion models.
Deep Learning and Machine Learning -- Python Data Structures and Mathematics Fundamental: From Theory to Practice
Chen, Silin, Bi, Ziqian, Liu, Junyu, Peng, Benji, Zhang, Sen, Pan, Xuanhe, Xu, Jiawei, Wang, Jinlang, Chen, Keyu, Yin, Caitlyn Heqi, Feng, Pohsun, Wen, Yizhu, Wang, Tianyang, Li, Ming, Ren, Jintao, Niu, Qian, Liu, Ming
This book provides a comprehensive introduction to the foundational concepts of machine learning (ML) and deep learning (DL). It bridges the gap between theoretical mathematics and practical application, focusing on Python as the primary programming language for implementing key algorithms and data structures. The book covers a wide range of topics, including basic and advanced Python programming, fundamental mathematical operations, matrix operations, linear algebra, and optimization techniques crucial for training ML and DL models. Advanced subjects like neural networks, optimization algorithms, and frequency domain methods are also explored, along with real-world applications of large language models (LLMs) and artificial intelligence (AI) in big data management. Designed for both beginners and advanced learners, the book emphasizes the critical role of mathematical principles in developing scalable AI solutions. Practical examples and Python code are provided throughout, ensuring readers gain hands-on experience in applying theoretical knowledge to solve complex problems in ML, DL, and big data analytics.
Embedding an ANN-Based Crystal Plasticity Model into the Finite Element Framework using an ABAQUS User-Material Subroutine
He, Yuqing, Heider, Yousef, Markert, Bernd
This manuscript presents a practical method for incorporating trained Neural Networks (NNs) into the Finite Element (FE) framework using a user material (UMAT) subroutine. The work takes crystal plasticity, a complex inelastic non-linear path-dependent material response with a wide range of applications, as its example in ABAQUS UMAT. However, this approach can be extended to other material behaviors and FE tools. The use of a UMAT subroutine serves two main purposes: (1) it predicts and updates the stress or other mechanical properties of interest directly from the strain history; (2) it computes the Jacobian matrix either through backpropagation or numerical differentiation, which plays an essential role in the solution convergence. By implementing NNs in a UMAT subroutine, a trained machine learning model can be employed as a data-driven constitutive law within the FEM framework, preserving multiscale information that conventional constitutive laws often neglect or average. The versatility of this method makes it a powerful tool for integrating machine learning into mechanical simulation. While this approach is expected to provide higher accuracy in reproducing realistic material behavior, special attention must be paid to the reliability of the solution process and the convergence conditions. The theory of the model is explained in [Heider et al. 2020], and exemplary source code is made available for interested readers [https://doi.org/10.25835/6n5uu50y].
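The abstract above mentions computing the Jacobian matrix by numerical differentiation as one of two options. A minimal sketch of that idea, using central differences on each input component, is given below. It is written in Python for illustration only and is not the authors' UMAT code; `numerical_jacobian`, `toy_stress`, and the step size `h` are hypothetical names and values, and `toy_stress` is an arbitrary stand-in for a trained network mapping strain to stress.

```python
def numerical_jacobian(func, x, h=1e-6):
    """Approximate J[i][j] = d func_i / d x_j with central differences."""
    n = len(x)
    m = len(func(x))
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input component in both directions.
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus = func(x_plus)
        f_minus = func(x_minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return jac


def toy_stress(strain):
    """Hypothetical stand-in for a trained strain-to-stress model."""
    e1, e2 = strain
    return [2.0 * e1 + e1 * e2, e2 ** 2 + 0.5 * e1]


jac = numerical_jacobian(toy_stress, [0.1, 0.2])
# Analytic Jacobian for comparison: [[2 + e2, e1], [0.5, 2 * e2]]
print(jac)
```

This costs two model evaluations per input component, whereas backpropagation obtains the same derivatives from the network itself without choosing a step size.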
Diffeomorphic Transformations for Time Series Analysis: An Efficient Approach to Nonlinear Warping
The proliferation and ubiquity of temporal data across many disciplines has sparked interest in similarity, classification and clustering methods specifically designed to handle time series data. A core issue when dealing with time series is determining their pairwise similarity, i.e., the degree to which a given time series resembles another. Traditional distance measures such as the Euclidean distance are not well-suited due to the time-dependent nature of the data. Elastic metrics such as dynamic time warping (DTW) offer a promising approach, but are limited by their computational complexity, non-differentiability and sensitivity to noise and outliers. This thesis proposes novel elastic alignment methods that use parametric and diffeomorphic warping transformations as a means of overcoming the shortcomings of DTW-based metrics. The proposed method is differentiable and invertible, well-suited for deep learning architectures, robust to noise and outliers, computationally efficient, and expressive and flexible enough to capture complex patterns. Furthermore, a closed-form solution was developed for the gradient of these diffeomorphic transformations, which allows an efficient search in the parameter space, leading to better solutions at convergence. Leveraging the benefits of these closed-form diffeomorphic transformations, this thesis proposes a suite of advancements that include: (a) an enhanced temporal transformer network for time series alignment and averaging, (b) a deep-learning based time series classification model to simultaneously align and classify signals with high accuracy, (c) an incremental time series clustering algorithm that is warping-invariant, scalable and can operate under limited computational and time resources, and finally, (d) a normalizing flow model that enhances the flexibility of affine transformations in coupling and autoregressive layers.
Automatic Differentiation of Algorithms for Machine Learning
Baydin, Atilim Gunes, Pearlmutter, Barak A.
Automatic differentiation, the mechanical transformation of numeric computer programs to calculate derivatives efficiently and accurately, dates to the origin of the computer age. Reverse mode automatic differentiation both antedates and generalizes the method of backwards propagation of errors used in machine learning. Despite this, practitioners in a variety of fields, including machine learning, have been little influenced by automatic differentiation, and make scant use of available tools. Here we review the technique of automatic differentiation, describe its two main modes, and explain how it can benefit machine learning practitioners. To reach the widest possible audience our treatment assumes only elementary differential calculus, and does not assume any knowledge of linear algebra.
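The abstract above refers to two main modes of automatic differentiation and names reverse mode; the other is forward mode. A minimal sketch of forward mode using dual numbers is given below, as an illustration rather than the authors' presentation. The names `Dual`, `dual_sin`, `derivative`, and `g` are hypothetical, and only addition, multiplication, and sine are supported.

```python
import math


class Dual:
    """Number of the form value + deriv * eps, where eps**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule carried alongside the ordinary product.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def dual_sin(x):
    """Sine lifted to dual numbers via the chain rule."""
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).deriv


def g(x):
    return x * x + 3.0 * dual_sin(x)


print(derivative(g, 2.0))  # analytically 2*2 + 3*cos(2)
```

Unlike a finite difference, this propagates exact derivative rules through each operation, so there is no step size to choose and no truncation error.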